feat: Add int8 sd conversion function for aiu (#95)
Merged
chichun-charlie-liu merged 13 commits into foundation-model-stack:main on May 9, 2025
Conversation
Signed-off-by: Andrea Fasoli <andrea.fasoli@ibm.com>
Force-pushed from dc803bb to 60f1aa0
Author (Collaborator):
By adding SAWB recomputation in the presence of narrow INT weight distributions, this PR also addresses issue #109.
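SAWB-style quantizers derive the clipping value from the first and second moments of the weight distribution, so a very narrow distribution can yield a degenerate clip unless it is recomputed. The sketch below is a hypothetical illustration of that idea, not fms-mo's implementation; the coefficients `c1` and `c2` are illustrative placeholders, not the constants the library actually uses.

```python
import math
import random

def sawb_clip(w, c1=3.2, c2=-2.1):
    # Statistics-aware clip value: a linear combination of sqrt(E[w^2])
    # and E[|w|]. c1/c2 are illustrative; real SAWB uses bit-width-specific
    # constants fitted offline.
    e2 = math.sqrt(sum(x * x for x in w) / len(w))
    e1 = sum(abs(x) for x in w) / len(w)
    return c1 * e2 + c2 * e1

def quantize_int8(w):
    # Symmetric per-tensor INT8 quantization using the SAWB-style clip.
    alpha = sawb_clip(w)
    scale = alpha / 127.0
    return [max(-127, min(127, round(x / scale))) for x in w], scale

random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(4096)]  # narrow distribution
q, scale = quantize_int8(weights)
```

Recomputing the clip from the actual (narrow) statistics keeps `scale` small enough that the INT8 grid still resolves the weights instead of collapsing most of them to zero.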
Author (Collaborator):
Needs testing and unit testing...
Author (Collaborator):
The PR has been tested with multiple INT8 quantization configurations and is ready to be merged. Unit tests are missing and will be added at a later date (cc: @BrandonGroth). Conversion of a checkpoint trained with SmoothQuant is not supported when combined with SAWB-based weight recomputation; that feature will be implemented as part of issue #112. It is not strictly needed for INT8 RoBERTa, where SmoothQuant-free quantization configurations perform well, but it may be needed for the enablement of INT8 LLMs.
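For context on why SmoothQuant interacts badly with weight recomputation: SmoothQuant moves activation outliers into the weights via a per-input-channel rescaling that leaves the matmul output unchanged, so any later pass that recomputes weight statistics sees the rescaled (not the original) distribution. A minimal, hypothetical sketch of the rescaling identity (plain lists, not fms-mo code; the smoothing factors `s` are illustrative):

```python
def matvec(x, w):
    # y_j = sum_i x_i * w[i][j]
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]

def smooth(x, w, s):
    # Divide activations and multiply weights per input channel i by s[i]:
    # (x_i / s_i) * (s_i * w_ij) == x_i * w_ij, so the product is unchanged
    # while the activation outlier magnitude shrinks.
    x_s = [xi / si for xi, si in zip(x, s)]
    w_s = [[si * wij for wij in row] for si, row in zip(s, w)]
    return x_s, w_s

x = [8.0, 0.5, -3.0]                              # row with an outlier channel
w = [[0.1, -0.2], [0.3, 0.4], [0.05, 0.0]]
s = [4.0, 1.0, 2.0]                               # illustrative smoothing factors
x_s, w_s = smooth(x, w, s)
y_before, y_after = matvec(x, w), matvec(x_s, w_s)
```

Since the smoothed weights `w_s` differ from the trained weights, a SAWB recomputation on top of them would target the wrong distribution unless the smoothing scales are folded in first, which is what issue #112 is expected to address.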
chichun-charlie-liu approved these changes on May 9, 2025.
Merged commit 09c7761 into foundation-model-stack:main. 11 checks passed.
Description of the change
This PR introduces a conversion function for the state dictionary of an INT8-quantized model created with fms-mo.
By calling save_sd_for_aiu, a new state dictionary that complies with AIU requirements is generated and saved. This new state dictionary / checkpoint can be loaded using the fms get_model function in combination with the INT8 add-ons already present in fms-mo (see fms_mo/aiu_addons/). The following processing steps are taken by this conversion function:
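The enumerated processing steps are not included in this excerpt. Purely as an illustration of the general shape of such a conversion (not the actual save_sd_for_aiu implementation, which performs additional steps such as key renaming and SAWB recomputation), a per-tensor symmetric INT8 pass over a state dict might look like:

```python
def convert_sd_int8(sd):
    # Hypothetical sketch: for every floating-point weight list in `sd`,
    # compute a symmetric per-tensor scale and store the INT8 values plus
    # the scale under new keys.
    out = {}
    for name, w in sd.items():
        amax = max(abs(x) for x in w) or 1.0   # guard against all-zero tensors
        scale = amax / 127.0
        out[name] = [max(-127, min(127, round(x / scale))) for x in w]
        out[name + ".scale"] = scale
    return out

sd = {"linear.weight": [0.5, -0.25, 0.1, -0.01]}
converted = convert_sd_int8(sd)
```

The converted checkpoint then carries both the INT8 values and the scales needed to dequantize them at load time, which is the contract the AIU add-ons consume.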
Was the PR tested